gh-139871: Add bytearray.take_bytes([n]) to efficiently extract bytes
#140128
This sets up so the bytes can be "taken" as a bytes object without requiring a copy.

I ran pyperformance (results below) and don't see any major speedups or slowdowns with this; all seems to be in the noise of my machine.

pyperformance compare main.json bytearray_bytes.json -O table

main.json
=========
Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-14 00:55:52.519236
End date: 2025-10-14 02:23:01.308400

bytearray_bytes.json
====================
Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-13 23:22:29.928152
End date: 2025-10-14 00:49:34.467284

| Benchmark | main.json | bytearray_bytes.json | Change | Significance |
|---|---|---|---|---|
| 2to3 | 137 ms | 136 ms | 1.00x faster | Not significant |
| async_generators | 193 ms | 195 ms | 1.01x slower | Not significant |
| async_tree_cpu_io_mixed | 285 ms | 286 ms | 1.01x slower | Not significant |
| async_tree_cpu_io_mixed_tg | 289 ms | 290 ms | 1.00x slower | Not significant |
| async_tree_eager | 50.4 ms | 51.5 ms | 1.02x slower | Significant (t=-10.40) |
| async_tree_eager_cpu_io_mixed | 223 ms | 225 ms | 1.01x slower | Not significant |
| async_tree_eager_cpu_io_mixed_tg | 263 ms | 264 ms | 1.00x slower | Not significant |
| async_tree_eager_io | 370 ms | 372 ms | 1.01x slower | Not significant |
| async_tree_eager_io_tg | 380 ms | 384 ms | 1.01x slower | Not significant |
| async_tree_eager_memoization | 125 ms | 126 ms | 1.01x slower | Not significant |
| async_tree_eager_memoization_tg | 161 ms | 162 ms | 1.00x slower | Not significant |
| async_tree_eager_tg | 125 ms | 125 ms | 1.00x slower | Not significant |
| async_tree_io | 366 ms | 360 ms | 1.02x faster | Not significant |
| async_tree_io_tg | 359 ms | 361 ms | 1.00x slower | Not significant |
| async_tree_memoization | 177 ms | 181 ms | 1.02x slower | Significant (t=-9.20) |
| async_tree_memoization_tg | 188 ms | 189 ms | 1.01x slower | Not significant |
| async_tree_none | 151 ms | 151 ms | 1.01x slower | Not significant |
| async_tree_none_tg | 150 ms | 151 ms | 1.01x slower | Not significant |
| asyncio_tcp | 182 ms | 161 ms | 1.13x faster | Significant (t=32.85) |
| asyncio_tcp_ssl | 548 ms | 553 ms | 1.01x slower | Not significant |
| asyncio_websockets | 342 ms | 339 ms | 1.01x faster | Not significant |
| bench_mp_pool | 7.12 ms | 7.08 ms | 1.01x faster | Not significant |
| bench_thread_pool | 818 us | 819 us | 1.00x slower | Not significant |
| bpe_tokeniser | 2.10 sec | 2.09 sec | 1.00x faster | Not significant |
| chaos | 27.9 ms | 28.0 ms | 1.00x slower | Not significant |
| comprehensions | 7.45 us | 7.24 us | 1.03x faster | Significant (t=3.27) |
| connected_components | 308 ms | 309 ms | 1.00x slower | Not significant |
| coroutines | 11.1 ms | 11.2 ms | 1.00x slower | Not significant |
| coverage | 33.6 ms | 34.1 ms | 1.02x slower | Not significant |
| create_gc_cycles | 1.16 ms | 1.16 ms | 1.01x slower | Not significant |
| crypto_pyaes | 37.1 ms | 35.6 ms | 1.04x faster | Significant (t=10.63) |
| dask | 347 ms | 351 ms | 1.01x slower | Not significant |
| deepcopy | 118 us | 117 us | 1.00x faster | Not significant |
| deepcopy_memo | 12.8 us | 12.7 us | 1.00x faster | Not significant |
| deepcopy_reduce | 1.32 us | 1.34 us | 1.02x slower | Not significant |
| deltablue | 1.65 ms | 1.64 ms | 1.01x faster | Not significant |
| django_template | 17.9 ms | 17.8 ms | 1.00x faster | Not significant |
| docutils | 1.19 sec | 1.20 sec | 1.01x slower | Not significant |
| dulwich_log | 19.5 ms | 19.7 ms | 1.01x slower | Not significant |
| fannkuch | 184 ms | 181 ms | 1.02x faster | Not significant |
| float | 37.1 ms | 36.7 ms | 1.01x faster | Not significant |
| gc_traversal | 3.04 ms | 2.84 ms | 1.07x faster | Significant (t=19.48) |
| generators | 15.9 ms | 15.3 ms | 1.04x faster | Significant (t=7.03) |
| genshi_text | 11.3 ms | 11.2 ms | 1.01x faster | Not significant |
| genshi_xml | 25.5 ms | 25.5 ms | 1.00x faster | Not significant |
| go | 57.6 ms | 56.7 ms | 1.02x faster | Not significant |
| hexiom | 2.92 ms | 2.88 ms | 1.02x faster | Not significant |
| html5lib | 26.0 ms | 26.5 ms | 1.02x slower | Significant (t=-9.20) |
| json_dumps | 4.48 ms | 4.44 ms | 1.01x faster | Not significant |
| json_loads | 11.7 us | 11.7 us | 1.01x slower | Not significant |
| k_core | 1.41 sec | 1.42 sec | 1.01x slower | Not significant |
| logging_format | 3.27 us | 3.30 us | 1.01x slower | Not significant |
| logging_silent | 45.5 ns | 45.8 ns | 1.01x slower | Not significant |
| logging_simple | 3.02 us | 3.01 us | 1.00x faster | Not significant |
| mako | 6.02 ms | 6.03 ms | 1.00x slower | Not significant |
| many_optionals | 473 us | 478 us | 1.01x slower | Not significant |
| mdp | 587 ms | 578 ms | 1.02x faster | Not significant |
| meteor_contest | 50.2 ms | 50.5 ms | 1.01x slower | Not significant |
| nbody | 54.6 ms | 52.4 ms | 1.04x faster | Significant (t=10.72) |
| nqueens | 41.7 ms | 40.4 ms | 1.03x faster | Significant (t=6.79) |
| pathlib | 9.77 ms | 9.73 ms | 1.00x faster | Not significant |
| pickle | 5.99 us | 6.01 us | 1.00x slower | Not significant |
| pickle_dict | 12.5 us | 12.8 us | 1.02x slower | Not significant |
| pickle_list | 1.98 us | 1.96 us | 1.01x faster | Not significant |
| pickle_pure_python | 149 us | 150 us | 1.00x slower | Not significant |
| pidigits | 111 ms | 115 ms | 1.03x slower | Significant (t=-18.53) |
| pprint_pformat | 737 ms | 748 ms | 1.02x slower | Not significant |
| pprint_safe_repr | 362 ms | 369 ms | 1.02x slower | Not significant |
| pyflate | 211 ms | 205 ms | 1.03x faster | Significant (t=7.43) |
| python_startup | 7.88 ms | 7.88 ms | 1.00x faster | Not significant |
| python_startup_no_site | 4.72 ms | 4.76 ms | 1.01x slower | Not significant |
| raytrace | 130 ms | 128 ms | 1.02x faster | Not significant |
| regex_compile | 50.0 ms | 50.2 ms | 1.00x slower | Not significant |
| regex_dna | 101 ms | 103 ms | 1.02x slower | Not significant |
| regex_effbot | 1.72 ms | 1.77 ms | 1.03x slower | Significant (t=-26.42) |
| regex_v8 | 12.5 ms | 12.3 ms | 1.02x faster | Not significant |
| richards | 20.4 ms | 20.0 ms | 1.02x faster | Not significant |
| richards_super | 23.4 ms | 22.8 ms | 1.03x faster | Significant (t=11.36) |
| scimark_fft | 154 ms | 153 ms | 1.00x faster | Not significant |
| scimark_lu | 55.4 ms | 57.0 ms | 1.03x slower | Significant (t=-5.67) |
| scimark_monte_carlo | 32.8 ms | 32.8 ms | 1.00x slower | Not significant |
| scimark_sor | 57.8 ms | 56.9 ms | 1.02x faster | Not significant |
| scimark_sparse_mat_mult | 2.75 ms | 2.76 ms | 1.00x slower | Not significant |
| shortest_path | 316 ms | 318 ms | 1.01x slower | Not significant |
| spectral_norm | 47.7 ms | 51.6 ms | 1.08x slower | Significant (t=-2.01) |
| sphinx | 465 ms | 467 ms | 1.00x slower | Not significant |
| sqlglot_v2_normalize | 50.3 ms | 50.2 ms | 1.00x faster | Not significant |
| sqlglot_v2_optimize | 24.2 ms | 24.4 ms | 1.01x slower | Not significant |
| sqlglot_v2_parse | 576 us | 572 us | 1.01x faster | Not significant |
| sqlglot_v2_transpile | 724 us | 722 us | 1.00x faster | Not significant |
| sqlite_synth | 1.14 us | 1.15 us | 1.01x slower | Not significant |
| subparsers | 20.6 ms | 20.7 ms | 1.00x slower | Not significant |
| sympy_expand | 181 ms | 184 ms | 1.01x slower | Not significant |
| sympy_integrate | 8.54 ms | 8.55 ms | 1.00x slower | Not significant |
| sympy_str | 103 ms | 105 ms | 1.02x slower | Not significant |
| sympy_sum | 55.9 ms | 56.0 ms | 1.00x slower | Not significant |
| telco | 3.39 ms | 3.34 ms | 1.01x faster | Not significant |
| tomli_loads | 971 ms | 982 ms | 1.01x slower | Not significant |
| typing_runtime_protocols | 73.2 us | 73.6 us | 1.01x slower | Not significant |
| unpack_sequence | 25.2 ns | 23.0 ns | 1.10x faster | Significant (t=7.03) |
| unpickle | 6.99 us | 7.05 us | 1.01x slower | Not significant |
| unpickle_list | 2.07 us | 2.10 us | 1.01x slower | Not significant |
| unpickle_pure_python | 105 us | 104 us | 1.01x faster | Not significant |
| xml_etree_generate | 40.5 ms | 40.7 ms | 1.00x slower | Not significant |
| xml_etree_iterparse | 49.7 ms | 50.4 ms | 1.01x slower | Not significant |
| xml_etree_parse | 77.2 ms | 79.1 ms | 1.02x slower | Significant (t=-16.14) |
| xml_etree_process | 29.5 ms | 29.8 ms | 1.01x slower | Not significant |
Misc/NEWS.d/next/Core_and_Builtins/2025-10-14-18-24-16.gh-issue-139871.SWtuUz.rst
Co-authored-by: Victor Stinner <[email protected]>
Threading tests found a non-threading issue that after this change
Co-authored-by: Maurycy Pawłowski-Wieroński <[email protected]>
… which is resulting in threading failure
Co-authored-by: Maurycy Pawłowski-Wieroński <[email protected]>
Objects/bytearrayobject.c
    return PyLong_FromSsize_t(FT_ATOMIC_LOAD_SSIZE_RELAXED(self->ob_alloc));
    Py_ssize_t alloc = FT_ATOMIC_LOAD_SSIZE_RELAXED(self->ob_alloc);
    if (alloc > 0) {
        alloc += sizeof(PyBytesObject);
Adding in the size of PyBytesObject here (and in sizeof) because ob_alloc is expected by code to be the number of bytes of space available (vs ob_size, the number of bytes in use). Felt more straightforward to me to leave ob_alloc and ob_size definitions as they were and rather add in to the size reporting here.
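The distinction drawn in this comment, capacity (ob_alloc) versus bytes in use (ob_size), can be observed from Python through `bytearray.__alloc__()` and `len()`. The following is a minimal sketch, assuming CPython; the exact numbers reported by `__alloc__()` are an implementation detail and may differ between versions (including before and after this change), so only the ordering is meaningful:

```python
import sys

buf = bytearray(b"abc")
in_use = len(buf)             # corresponds to ob_size: bytes currently stored
capacity = buf.__alloc__()    # CPython-specific: reported allocation for the buffer

# Growing the bytearray may over-allocate, so capacity can exceed the length.
buf.extend(b"defgh")
grown_in_use = len(buf)
grown_capacity = buf.__alloc__()

print(in_use, capacity, grown_in_use, grown_capacity)
print(sys.getsizeof(buf))     # total reported size, object overhead included
```

Only the relationship "capacity is at least the length" is stable enough to rely on here.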
@vstinner I think this is ready for another pass; I left GitHub comments around some places where I am unsure of the standard CPython way to do things, as well as ones where I'm not sure what the right decision is.
Co-authored-by: Victor Stinner <[email protected]>
1. After __init__ or C construction guarantee ob_bytes_object is set by using empty bytes object.
2. In resize place a null terminator mid-buffer only if required
3. Remove now unneeded branches
   - n == PY_SSIZE_T_MAX checks are redundant with resize checks.
   - size = 0 is handled by PyBytes_FromStringAndSize
   - No more alloc + 1; exact resize is exact and bytes does +1 for null
   - No downsize to 0 special case since alloc == size there.
    if (size == 0) {
        new->ob_bytes = NULL;
        alloc = 0;
    new->ob_bytes_object = PyBytes_FromStringAndSize(NULL, size);
note: this now always gets set, but if size=0 then PyBytes_FromStringAndSize doesn't actually allocate / returns the empty bytes object so the optimization, don't allocate when zero-sized, is kept.
Lines 139 to 154 in 0f0a362
    PyBytes_FromStringAndSize(const char *str, Py_ssize_t size)
    {
        PyBytesObject *op;
        if (size < 0) {
            PyErr_SetString(PyExc_SystemError,
                "Negative size passed to PyBytes_FromStringAndSize");
            return NULL;
        }
        if (size == 1 && str != NULL) {
            op = CHARACTER(*str & 255);
            assert(_Py_IsImmortal(op));
            return (PyObject *)op;
        }
        if (size == 0) {
            return bytes_get_empty();
        }
Please add a comment to mention that the object is the empty bytes string singleton if size=0.
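The empty-bytes singleton discussed in this thread can be seen from Python. This is a minimal sketch, assuming CPython: identity of zero-length bytes objects is an implementation detail and not a language guarantee, so code should never depend on it.

```python
# CPython implementation detail: zero-length bytes objects share one
# singleton, so producing one does not allocate a new buffer.
empty_a = bytes()
empty_b = bytes(bytearray())   # built from an empty buffer at runtime
empty_c = b"abc"[:0]           # empty slice of a non-empty bytes object

same_object = (empty_a is empty_b) and (empty_b is empty_c)
print(same_object)
```

This is why always assigning the result of PyBytes_FromStringAndSize(NULL, 0) keeps the "no allocation for zero-sized bytearray" behaviour.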
Objects/bytearrayobject.c
    /* Prevent buffer overflow when setting alloc to size+1. */
    /* Prevent buffer overflow when setting alloc to size. */
    if (size == PY_SSIZE_T_MAX) {
You can remove this test, PyBytes_FromStringAndSize() has a stricter test on size maximum value:
    if ((size_t)size > (size_t)PY_SSIZE_T_MAX - PyBytesObject_SIZE) {
        PyErr_SetString(PyExc_OverflowError,
                        "byte string is too large");
        return NULL;
    }
Is it okay if this starts raising OverflowError instead of MemoryError? MemoryError is tested for in test_bytes, but it would simplify a number of cases if we can always rely on bytes to do the length check.
bytes, because the PyObject_HEAD is inline, has a slightly lower max length than bytearray, which did a PyMem_Malloc. That means the max bytearray length moves from PY_SSIZE_T_MAX - 1 (where we need to worry about overflowing a Py_ssize_t more often) to PY_SSIZE_T_MAX - PyBytesObject_SIZE. That's also why the + 1 checks are no longer needed: the "bytes" allocation will always fail / overflow before we'd wrap around with a + 1.
added PyByteArray_SIZE_MAX to simplify these checks
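The limit arithmetic in this thread can be worked through with Python integers. This is only an illustrative sketch: it assumes CPython, and it uses `sys.getsizeof(b"")` as a stand-in for PyBytesObject_SIZE (the inline header plus the trailing NUL), which is an assumption about how CPython reports the size of an empty bytes object rather than something defined by the PR.

```python
import sys

PY_SSIZE_T_MAX = sys.maxsize            # Python-level view of PY_SSIZE_T_MAX

# Assumption: on CPython this reports the fixed overhead of a bytes object
# (inline header plus trailing NUL), standing in for PyBytesObject_SIZE.
bytes_overhead = sys.getsizeof(b"")

# Largest payload for which payload + overhead still fits in a Py_ssize_t.
max_bytes_len = PY_SSIZE_T_MAX - bytes_overhead

# Old bytearray limit described in the thread: a separately malloc'd buffer
# with one extra byte reserved.
old_bytearray_max = PY_SSIZE_T_MAX - 1

print(bytes_overhead, max_bytes_len, old_bytearray_max - max_bytes_len)
```

The gap between the two limits is a few dozen bytes at most, which is why moving the check into bytes changes the error raised near the limit rather than any practical capacity.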
Objects/bytearrayobject.c
    }
    if (alloc > PY_SSIZE_T_MAX) {
    // NOTE: offsetof() logic copied from PyBytesObject_SIZE in bytesobject.c
    if (alloc > PY_SSIZE_T_MAX - (offsetof(PyBytesObject, ob_sval) + 1)) {
Can you please move PyBytesObject_SIZE from bytesobject.c to pycore_bytesobject.h and rename it as _PyBytesObject_SIZE? Then add #define PyBytesObject_SIZE _PyBytesObject_SIZE to bytesobject.c (to avoid modifying the code).
Moved, also added a PyByteArray_SIZE_MAX which generally replaces PY_SSIZE_T_MAX to make the checks more precise (and found one case which was still doing a + 1 / - 1 that doesn't need to anymore).
        PyErr_SetString(PyExc_OverflowError,
                        "cannot add more objects to bytearray");
        return NULL;
    }
You cannot remove this check, there is n+1 just below which can overflow, no?
With the new constant, this passes when I add it locally:
static_assert(PyByteArray_SIZE_MAX + 1 < PY_SSIZE_T_MAX, "Py_SIZE(self) + 1 code may overflow");
(there's a static_assert in the new JIT, but I don't see one anywhere else / don't know a good place to add that if we want it inside the codebase)
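To illustrate the overflow concern behind that static_assert, the wraparound can be modelled with `ctypes.c_ssize_t`, which performs no overflow checking. This is a sketch under stated assumptions: `ASSUMED_HEADER` and `SIZE_MAX` are hypothetical stand-ins for the bytes overhead and PyByteArray_SIZE_MAX, and the value 33 is purely illustrative.

```python
import ctypes
import sys


def wrapped_ssize(value: int) -> int:
    """Return value as stored in a C signed integer the size of Py_ssize_t.

    ctypes integer types do no overflow checking, so out-of-range values wrap.
    """
    return ctypes.c_ssize_t(value).value


PY_SSIZE_T_MAX = sys.maxsize

# Hypothetical stand-in for PyByteArray_SIZE_MAX; the real overhead is
# platform-dependent and 33 is only an illustrative value.
ASSUMED_HEADER = 33
SIZE_MAX = PY_SSIZE_T_MAX - ASSUMED_HEADER

at_hard_limit = wrapped_ssize(PY_SSIZE_T_MAX + 1)   # wraps to a negative value
at_size_max = wrapped_ssize(SIZE_MAX + 1)           # still representable

print(at_hard_limit < 0, at_size_max == SIZE_MAX + 1)
```

With a length capped below PY_SSIZE_T_MAX - 1, an `n + 1` computed afterwards stays in range, which is the property the static_assert states at compile time.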
With a little more tweaking can rely on
Update bytearray to contain a bytes and provide a zero-copy path to "extract" the bytes. This allows making several code paths more efficient.

This does not move any codepaths to make use of this new API. The documentation changes include common code patterns which can be made more efficient with this API.
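As a rough illustration of the kind of pattern the description refers to, the sketch below wraps the new method with a fallback for interpreters that lack it. `take_bytes_compat` is a hypothetical helper, not part of the PR, and it assumes (from the PR title) that `take_bytes(n)` removes and returns the first `n` bytes, or everything when called with no argument. Only `0 <= n <= len(buf)` is considered.

```python
from typing import Optional


def take_bytes_compat(buf: bytearray, n: Optional[int] = None) -> bytes:
    """Remove the first n bytes of buf (all of it if n is None) and return them.

    Hypothetical helper for illustration; only 0 <= n <= len(buf) is handled.
    """
    native = getattr(buf, "take_bytes", None)
    if native is not None:
        # Interpreter that includes this change: use the method directly.
        return native() if n is None else native(n)
    # Older interpreters: the usual copy-then-delete pattern.
    if n is None:
        n = len(buf)
    data = bytes(buf[:n])   # copies the prefix out of the bytearray
    del buf[:n]             # then drops it from the bytearray
    return data


buf = bytearray(b"hello world")
head = take_bytes_compat(buf, 5)
rest = take_bytes_compat(buf)
print(head, rest, buf)
```

The fallback branch shows the copies that the new method is meant to avoid when the whole buffer is handed over.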
When just changing bytearray to contain bytes I ran pyperformance on a --with-lto --enable-optimizations --with-static-libpython build (results below) and don't see any major speedups or slowdowns with this; all seems to be in the noise of my machine (generally changes under 5% or benchmarks that don't touch bytes/bytearray).

pyperformance compare main.json bytearray_bytes.json
main.json
Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-14 00:55:52.519236
End date: 2025-10-14 02:23:01.308400
bytearray_bytes.json
Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-13 23:22:29.928152
End date: 2025-10-14 00:49:34.467284
.take_bytes([n]) a zero-copy path from bytearray to bytes #139871